Search for: All records

Creators/Authors contains: "Malin, Bradley A."

Total Resources: 5

Resource Type
Conference Paper: 1
Conference Proceeding: 0
Dataset: 0
Journal Article: 4
Workshop Report: 0

Availability
Full Text / Resource Available: 5
Citation Only: 0

Re-identification of individuals in genomic datasets using public face images

https://doi.org/10.1126/sciadv.abg3296

Venkatesaramani, Rajagopal ; Malin, Bradley A. ; Vorobeychik, Yevgeniy ( November 2021 , Science Advances)

Recent studies suggest that genomic data can be matched to images of human faces, raising the concern that genomic data can be re-identified with relative ease. However, such investigations assume access to well-curated images, which are rarely available in practice and challenging to derive from photos not generated in a controlled laboratory setting. In this study, we reconsider re-identification risk and find that, for most individuals, the actual risk posed by linkage attacks to typical face images is substantially smaller than claimed in prior investigations. Moreover, we show that only a small amount of well-calibrated noise, imperceptible to humans, can be added to images to markedly reduce such risk. The results of this investigation create an opportunity to create image filters that enable individuals to have better control over re-identification risk based on linkage.
Full Text Available
Dynamically adjusting case reporting policy to maximize privacy and public health utility in the face of a pandemic

https://doi.org/10.1093/jamia/ocac011

Brown, J Thomas ; Yan, Chao ; Xia, Weiyi ; Yin, Zhijun ; Wan, Zhiyu ; Gkoulalas-Divanis, Aris ; Kantarcioglu, Murat ; Malin, Bradley A ( February 2022 , Journal of the American Medical Informatics Association)

Abstract Objective
Supporting public health research and the public’s situational awareness during a pandemic requires continuous dissemination of infectious disease surveillance data. Legislation, such as the Health Insurance Portability and Accountability Act of 1996 and recent state-level regulations, permits sharing deidentified person-level data; however, current deidentification approaches are limited. Namely, they are inefficient, relying on retrospective disclosure risk assessments, and do not flex with changes in infection rates or population demographics over time. In this paper, we introduce a framework to dynamically adapt deidentification for near-real time sharing of person-level surveillance data.
Materials and Methods
The framework leverages a simulation mechanism, capable of application at any geographic level, to forecast the reidentification risk of sharing the data under a wide range of generalization policies. The estimates inform weekly, prospective policy selection to maintain the proportion of records corresponding to a group size less than 11 (PK11) at or below 0.1. Fixing the policy at the start of each week facilitates timely dataset updates and supports sharing granular date information. We use August 2020 through October 2021 case data from Johns Hopkins University and the Centers for Disease Control and Prevention to demonstrate the framework’s effectiveness in maintaining the PK11 threshold of 0.01.
Results
When sharing COVID-19 county-level case data across all US counties, the framework’s approach meets the threshold for 96.2% of daily data releases, while a policy based on current deidentification techniques meets the threshold for 32.3%.
Conclusion
Periodically adapting the data publication policies preserves privacy while enhancing public health utility through timely updates and sharing epidemiologically critical features.
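
As an illustration of the PK11 measure named in the abstract above, the following is a minimal sketch, not the authors' implementation. It assumes each record is reduced to a tuple of quasi-identifier values and that group sizes are counted within the shared records themselves; the function name `pk`, the parameter `k`, and the toy data are hypothetical.

```python
from collections import Counter


def pk(records, k=11):
    """Proportion of records whose quasi-identifier group has fewer than k members.

    Assumption: `records` is a list of hashable tuples of quasi-identifier
    values, and group size is counted within `records` itself.
    """
    if not records:
        return 0.0
    group_sizes = Counter(records)
    # Every record in a group smaller than k counts toward the proportion.
    at_risk = sum(size for size in group_sizes.values() if size < k)
    return at_risk / len(records)


# Hypothetical toy data (age band, sex); not taken from the paper.
records = [("30-39", "F")] * 12 + [("80+", "M")] * 3
pk11 = pk(records)
print(pk11)  # 3 of the 15 records fall in a group smaller than 11
```

Under a coarser generalization policy (for example, wider age bands) small groups merge into larger ones, so the same function would typically return a lower value.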
Full Text Available
To Warn or Not to Warn: Online Signaling in Audit Games

https://doi.org/10.1109/ICDE48307.2020.00048

Yan, Chao ; Xu, Haifeng ; Vorobeychik, Yevgeniy ; Li, Bo ; Fabbri, Daniel ; Malin, Bradley A. ( April 2020 , ICDE 2020)

Full Text Available
Ensuring electronic medical record simulation through better training, modeling, and evaluation

https://doi.org/10.1093/jamia/ocz161

Zhang, Ziqi ; Yan, Chao ; Mesa, Diego A. ; Sun, Jimeng ; Malin, Bradley A. ( October 2019 , Journal of the American Medical Informatics Association)

Abstract Objective
Electronic medical records (EMRs) can support medical research and discovery, but privacy risks limit the sharing of such data on a wide scale. Various approaches have been developed to mitigate risk, including record simulation via generative adversarial networks (GANs). While showing promise in certain application domains, GANs lack a principled approach for EMR data that induces subpar simulation. In this article, we improve EMR simulation through a novel pipeline that (1) enhances the learning model, (2) incorporates evaluation criteria for data utility that informs learning, and (3) refines the training process.
Materials and Methods
We propose a new electronic health record generator using a GAN with a Wasserstein divergence and layer normalization techniques. We designed 2 utility measures to characterize similarity in the structural properties of real and simulated EMRs in the original and latent space, respectively. We applied a filtering strategy to enhance GAN training for low-prevalence clinical concepts. We evaluated the new and existing GANs with utility and privacy measures (membership and disclosure attacks) using billing codes from over 1 million EMRs at Vanderbilt University Medical Center.
Results
The proposed model outperformed the state-of-the-art approaches with significant improvement in retaining the nature of real records, including prediction performance and structural properties, without sacrificing privacy. Additionally, the filtering strategy achieved higher utility when the EMR training dataset was small.
Conclusions
These findings illustrate that EMR simulation through GANs can be substantially improved through more appropriate training, modeling, and evaluation criteria.

A clinical risk prediction model to identify patients with hepatorenal syndrome at hospital admission

https://doi.org/10.1111/ijcp.13393

Koola, Jejo D. ; Chen, Guanhua ; Malin, Bradley A. ; Fabbri, Daniel ; Siew, Edward D. ; Ho, Samuel B. ; Patterson, Olga V. ; Matheny, Michael E. ( August 2019 , International Journal of Clinical Practice)